{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Python Op Tutorial\n",
    "In this tutorial we cover the `Python` operator that allows writing Caffe2 operators using Python. We'll also discuss some of the underlying implementation details of the operator."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Forward Python Operator\n",
    "\n",
    "Caffe2 provides a high-level interface that helps with creating Python ops. Let's consider the following example, in which we create a basic operator `f`, which outputs the input * 2:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.\n",
      "WARNING:root:Debug message: No module named caffe2_pybind11_state_gpu\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[6.]\n"
     ]
    }
   ],
   "source": [
    "from __future__ import absolute_import\n",
    "from __future__ import division\n",
    "from __future__ import print_function\n",
    "from __future__ import unicode_literals\n",
    "\n",
    "from caffe2.python import core, workspace\n",
    "import numpy as np\n",
    "\n",
    "def f(inputs, outputs):\n",
    "    outputs[0].feed(2 * inputs[0].data)  # use 'feed' to set the output tensor to 2*input\n",
    "\n",
    "workspace.ResetWorkspace()\n",
    "net = core.Net(\"tutorial\")\n",
    "net.Python(f)([\"x\"], [\"y\"])\n",
    "workspace.FeedBlob(\"x\", np.array([3.]))\n",
    "workspace.RunNetOnce(net)\n",
    "print(workspace.FetchBlob(\"y\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As seen in the example, the `net.Python()` function returns a callable that can be used just like any other operator. In this example, we add a new Python operator to the net with input \"x\" and output \"y\". Note that you can save the output of `net.Python()` and call it multiple times to combine multiple Python operators with different inputs and outputs.\n",
    "\n",
    "Let's take a closer look at `net.Python()` function and the corresponding body of a new Python operator `f`. Every time `net.Python(f)` is called, it serializes a given function `f` and saves it in a global registry under a known key (token; passed to a PythonOp as an argument). After this, `net.Python()` returns a lambda that accepts positional and keyword arguments (typically inputs, outputs and extra arguments) and attaches a new Python operator to the net that calls function `f` on a given list of inputs and outputs.\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Python operator's function `f` expects two positional arguments: a list of inputs and a list of outputs. When an operator is executed it transparently converts Caffe2 blobs into the elements of these lists. CPU tensor blobs are converted into `TensorCPU` objects that act as wrappers around Numpy arrays. Let's take a closer look at a relationship between Caffe2 CPU tensor, Python's `TensorCPU` object and a Numpy array:\n",
    "\n",
    "1) Conversion between C++ tensor objects and Numpy objects happens automatically and is handled by PyBind library.\n",
    "\n",
    "2) When generating a `TensorCPU` wrapper, a new Numpy array object is created which **shares** the same memory storage as a corresponding Caffe2 CPU tensor. This Numpy array is accessible in Python as a `.data` property of a `TensorCPU` object.\n",
    "\n",
    "3) Although Numpy array and Caffe2 tensor might share the same storage, other tensor data (e.g. shape) of Caffe2 tensor is stored **separately** from a Numpy array. Furthermore, Numpy may copy and reallocate its array to a different location in memory (e.g. when we try to resize an array) during operator's function execution. It's important to keep that in mind when writing a Python operator's code to ensure that Caffe2 and Numpy output tensors are in sync.\n",
    "\n",
    "4) `TensorCPU`'s `feed` method accepts a Numpy tensor, resizes an underying Caffe2 tensor and copies Numpy's tensor data into a Caffe2 tensor.\n",
    "\n",
    "5) Another way to ensure that Caffe2's output tensor is properly set is to call the `reshape` function on a corresponding `TensorCPU` output, and copy the data in Python to the output's `.data` tensor.\n",
    "\n",
    "Below is an example of ensuring that Caffe2's output tensor is properly configured using `reshape`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[6.]\n"
     ]
    }
   ],
   "source": [
    "def f_reshape(inputs, outputs):\n",
    "    outputs[0].reshape(inputs[0].shape)        # set the output tensor to be the same shape as the input tensor\n",
    "    outputs[0].data[...] = 2 * inputs[0].data  # assign output tensor (Numpy) to 2 * input tensor (Numpy)\n",
    "\n",
    "workspace.ResetWorkspace()\n",
    "net = core.Net(\"tutorial\")\n",
    "net.Python(f_reshape)([\"x\"], [\"z\"])\n",
    "workspace.FeedBlob(\"x\", np.array([3.]))\n",
    "workspace.RunNetOnce(net)\n",
    "print(workspace.FetchBlob(\"z\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This example works correctly because `reshape` method updates an underlying Caffe2 tensor and a subsequent call to the `.data` property returns a Numpy array that shares memory with a Caffe2 tensor. The last line in `f_reshape` copies data into the shared memory location."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There're several additional arguments that `net.Python()` accepts. When `pass_workspace=True` is passed, a workspace is passed to an operator's `Python` function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[6.]\n"
     ]
    }
   ],
   "source": [
    "def f_workspace(inputs, outputs, workspace):\n",
    "    # use 'feed' to set the output tensor to 2 * (a blob in the workspace called \"x\")\n",
    "    outputs[0].feed(2 * workspace.blobs[\"x\"].fetch())\n",
    "\n",
    "workspace.ResetWorkspace()\n",
    "net = core.Net(\"tutorial\")\n",
    "net.Python(f_workspace, pass_workspace=True)([], [\"y\"])  # add Python operator to net without specifying input blob\n",
    "workspace.FeedBlob(\"x\", np.array([3.]))  # manually feed the \"x\" blob the the f_workspace operator expects into the workspace\n",
    "workspace.RunNetOnce(net)\n",
    "print(workspace.FetchBlob(\"y\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Gradient Python Operator\n",
    "\n",
    "Another important `net.Python()` operator to discuss is a gradient operator which corresponds to another custom `net.Python()` operator. In the example below, `grad_f` is a corresponding gradient operator for `f`.\n",
    "\n",
    "Note that we are using the same `f` as before ( $y = 2x$ ). So when we expect the gradient with respect to x: ( $\\frac{dy}{dx} = 2$ )\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2.]\n"
     ]
    }
   ],
   "source": [
    "def f(inputs, outputs):\n",
    "            outputs[0].reshape(inputs[0].shape)\n",
    "            outputs[0].data[...] = inputs[0].data * 2\n",
    "\n",
    "def grad_f(inputs, outputs):\n",
    "    # Ordering of inputs is [fwd inputs, outputs, grad_outputs]\n",
    "    grad_output = inputs[2]\n",
    "    grad_input = outputs[0]\n",
    "    grad_input.reshape(grad_output.shape)\n",
    "    grad_input.data[...] = grad_output.data * 2\n",
    "\n",
    "workspace.ResetWorkspace()\n",
    "net = core.Net(\"tutorial\")\n",
    "net.Python(f, grad_f)([\"x\"], [\"y\"])\n",
    "workspace.FeedBlob(\"x\", np.array([3.]))\n",
    "net.AddGradientOperators([\"y\"])\n",
    "workspace.RunNetOnce(net)\n",
    "print(workspace.FetchBlob(\"x_grad\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When `net.Python()` is called with a gradient function specified, it also registers a serialized gradient function that is used by a corresponding gradient Python operator (**PythonGradient**). This operator executes a gradient function that expects two arguments - input and output lists. \n",
    "\n",
    "- The input list argument contains all forward function inputs, followed by all of its outputs, followed by the gradients of forward function outputs. \n",
    "- The output list contains the gradients of forward function inputs.\n",
    "\n",
    "Note: `net.Python()`'s **grad_output_indices**/**grad_input_indices** allow specifying indices of gradient output/input blobs that the gradient function reads/writes to."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Note on GPU tensors:\n",
    "\n",
    "PythonOp implementation is CPU specific, it uses Numpy arrays that expect CPU memory storage. In order to be able to use a Python operator with GPU tensors, we define a CUDA version of PythonOp using GPUFallbackOp. This operator wraps a CPU-operator and adds GPU-to-CPU (and opposite direction) copy operations. Thus, when using a PythonOp with a CUDA device options, all input CUDA tensors are automatically copied to CPU memory and all CPU output tensors are copied back to GPU."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
